Methods and apparatuses for video encoding and video decoding

ABSTRACT

Implementations are described for determining, for a block being encoded in a picture, at least one predictor candidate; determining, for the at least one predictor candidate, one or more corresponding control point generator motion vectors based on motion information associated with the at least one predictor candidate; determining, for the block being encoded, one or more corresponding control point motion vectors based on the one or more corresponding control point generator motion vectors determined for the at least one predictor candidate; determining, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field; and encoding the block based on the corresponding motion field.

TECHNICAL FIELD

At least one of the present embodiments generally relates to, e.g., a method or an apparatus for video encoding or decoding, and more particularly, to a method or an apparatus for selecting a predictor candidate from a set of predictor candidates for motion compensation based on a motion model such as, e.g., an affine model, for a video encoder or a video decoder.

BACKGROUND

To achieve high compression efficiency, image and video coding schemes usually employ prediction, including motion vector prediction, and transform to leverage spatial and temporal redundancy in the video content. Generally, intra or inter prediction is used to exploit the intra or inter frame correlation; then the differences between the original image and the predicted image, often denoted as prediction errors or prediction residuals, are transformed, quantized, and entropy coded. To reconstruct the video, the compressed data are decoded by inverse processes corresponding to the entropy coding, quantization, transform, and prediction.

A recent addition to high compression technology includes using a motion model based on affine modeling. In particular, affine modeling is used for motion compensation for encoding and decoding of video pictures. In general, affine modeling is a model using at least two parameters such as, e.g., two control point motion vectors (CPMVs) representing the motion at the respective corners of a block of a picture, that allows deriving a motion field for the whole block of a picture to simulate, e.g., rotation and homothety (zoom).

SUMMARY

According to a general aspect of at least one embodiment, a method for video encoding is disclosed. The method for video encoding comprises:

determining, for a block being encoded in a picture, at least one predictor candidate;

determining, for the at least one predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated with the at least one predictor candidate;

determining, for the block being encoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the at least one predictor candidate;

determining, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being encoded;

encoding the block based on the corresponding motion field.

According to another general aspect of at least one embodiment, a method for video decoding is disclosed. The method for video decoding comprises:

determining, for a block being decoded in a picture, a predictor candidate;

determining, for the predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated with the predictor candidate;

determining, for the block being decoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the predictor candidate;

determining, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being decoded;

decoding the block based on the corresponding motion field.

According to another general aspect of at least one embodiment, an apparatus for video encoding is disclosed. Such an encoding apparatus comprises:

means for determining, for a block being encoded in a picture, at least one predictor candidate;

means for determining, for the at least one predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated with the at least one predictor candidate;

means for determining, for the block being encoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the at least one predictor candidate;

means for determining, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being encoded;

means for encoding the block based on the corresponding motion field.

According to another general aspect of at least one embodiment, an apparatus for video decoding is disclosed, wherein the decoding apparatus comprises:

means for determining, for a block being decoded in a picture, a predictor candidate;

means for determining, for the predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated with the predictor candidate;

means for determining, for the block being decoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the predictor candidate;

means for determining, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being decoded;

means for decoding the block based on the corresponding motion field.

According to at least one embodiment, motion information associated with the predictor candidate corresponds to non-affine motion information. In this way, prediction of an affine motion model is improved by using a non-affine motion model. A non-affine motion model is a translational motion model wherein only one motion vector, representative of a translation, is coded in the model. The number of candidate predictors for predicting an affine motion model is thereby increased, thus improving compression efficiency.

According to at least one embodiment, the predictor candidate is comprised in a set of predictor candidates, and for the block being encoded/decoded, an index corresponding to the predictor candidate in the set of predictor candidates is encoded at the encoder or received by the decoder.

According to at least one embodiment, determining, for the predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated with the predictor candidate, comprises:

-   determining one or more corresponding control points associated with the predictor candidate, based on motion information associated with the predictor candidate;
-   determining the one or more corresponding control point generator motion vectors from the one or more corresponding control points associated with the predictor candidate.

According to this embodiment, an affine motion model is estimated for the predictor candidate based on motion information associated with the predictor candidate.

According to at least one embodiment, the one or more corresponding control point generator motion vectors comprise a motion vector $\overrightarrow{v_2}$ of a top-left corner of the predictor candidate, a motion vector $\overrightarrow{v_3}$ of an above-right corner of the predictor candidate, and a motion vector $\overrightarrow{v_4}$ of a left-bottom corner of the predictor candidate, and wherein the one or more corresponding control point motion vectors for the block comprise a motion vector $\overrightarrow{v_0}$ of a top-left corner of the block and a motion vector $\overrightarrow{v_1}$ of an above-right corner of the block, and wherein motion vectors $\overrightarrow{v_0}$ and $\overrightarrow{v_1}$ are determined by:

$$\overrightarrow{v_0} = \overrightarrow{v_2} + \left(\overrightarrow{v_4} - \overrightarrow{v_2}\right)\left(\frac{Y_{curr} - Y_{neighb}}{H_{neighb}}\right) + \left(\overrightarrow{v_3} - \overrightarrow{v_2}\right)\left(\frac{X_{curr} - X_{neighb}}{W_{neighb}}\right)$$

$$\overrightarrow{v_1} = \overrightarrow{v_0} + \left(\overrightarrow{v_3} - \overrightarrow{v_2}\right)\left(\frac{W_{curr}}{W_{neighb}}\right)$$

where $Y_{curr}$, $Y_{neighb}$ are respectively the vertical position of the block and of the predictor candidate in the picture, $X_{curr}$, $X_{neighb}$ are respectively the horizontal position of the block and of the predictor candidate in the picture, $W_{curr}$ is the horizontal size of the block, and $W_{neighb}$, $H_{neighb}$ are respectively the horizontal and vertical size of the predictor candidate.
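By way of illustration only, the two formulas above translate directly into code. The following Python sketch is hypothetical (the function name and the tuple-based vector representation are not part of any described embodiment); it merely evaluates the two equations.

```python
# Hypothetical sketch: motion vectors are (x, y) tuples of floats.

def derive_block_cpmvs(v2, v3, v4,
                       x_curr, y_curr, w_curr,
                       x_neighb, y_neighb, w_neighb, h_neighb):
    """Derive the current block's top-left and top-right CPMVs (v0, v1)
    from the predictor candidate's control point generator motion
    vectors v2 (top-left), v3 (top-right) and v4 (bottom-left)."""
    ry = (y_curr - y_neighb) / h_neighb  # relative vertical offset
    rx = (x_curr - x_neighb) / w_neighb  # relative horizontal offset
    v0 = (v2[0] + (v4[0] - v2[0]) * ry + (v3[0] - v2[0]) * rx,
          v2[1] + (v4[1] - v2[1]) * ry + (v3[1] - v2[1]) * rx)
    v1 = (v0[0] + (v3[0] - v2[0]) * w_curr / w_neighb,
          v0[1] + (v3[1] - v2[1]) * w_curr / w_neighb)
    return v0, v1
```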

According to at least one embodiment, the predictor candidate comprises one or more sub-blocks, each sub-block being associated with at least one motion vector, and determining, for the predictor candidate, one or more corresponding control points associated with the predictor candidate, based on motion information associated with the predictor candidate, comprises determining one or more corresponding control point motion vectors associated with the predictor candidate, based on at least two motion vectors associated respectively with at least two sub-blocks of the predictor candidate, and verifying that the one or more corresponding control point motion vectors associated with the predictor candidate satisfy an affine motion model.

According to this embodiment, determining one or more corresponding control points associated with the predictor candidate, based on motion information associated with the predictor candidate, is simple and does not imply heavy computations. According to this embodiment, it is verified that the motion model provided by the sub-blocks of the predictor candidate satisfies an affine motion model.

According to at least one embodiment, the predictor candidate comprises one or more sub-blocks, each sub-block being associated with at least one motion vector, and determining, for the predictor candidate, one or more corresponding control points associated with the predictor candidate, based on motion information associated with the predictor candidate, comprises determining, for at least two distinct sets of at least three sub-blocks of the predictor candidate, one or more corresponding control point motion vectors for the predictor candidate associated respectively with the at least two sets, based on the motion vectors associated respectively with the at least three sub-blocks of each set, and calculating one or more corresponding control point motion vectors associated with the predictor candidate by averaging the determined one or more corresponding control point motion vectors associated with each set.

According to this embodiment, multiple sets of one or more corresponding control point motion vectors are determined for the predictor candidate based on motion vectors associated with sub-blocks of the predictor candidate. Multiple distinct sets of sub-blocks are used. The one or more corresponding control point motion vectors for the predictor candidate are then calculated by averaging the determined one or more corresponding control point motion vectors from each set.

According to at least one embodiment, a control point generator motion vector $(v_x, v_y)$ at position $(x, y)$ is determined from the one or more corresponding control point motion vectors associated with the predictor candidate by:

$$\begin{cases} v_x = \dfrac{v_{1x} - v_{0x}}{w}x - \dfrac{v_{1y} - v_{0y}}{w}y + v_{0x} \\[1ex] v_y = \dfrac{v_{1y} - v_{0y}}{w}x + \dfrac{v_{1x} - v_{0x}}{w}y + v_{0y} \end{cases}$$

wherein $(v_{0x}, v_{0y})$ corresponds to the control point motion vector of the top-left corner of the predictor candidate, $(v_{1x}, v_{1y})$ corresponds to the control point motion vector of the top-right corner of the predictor candidate, and $w$ is the width of the predictor candidate.
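For illustration only, a minimal Python sketch of this motion vector computation follows; the function name is hypothetical, and the integer/fixed-point arithmetic used in a real codec is ignored.

```python
# Hypothetical sketch of the 4-parameter affine model above.

def affine_mv(v0, v1, w, x, y):
    """Motion vector (vx, vy) at position (x, y), given the top-left
    CPMV v0 = (v0x, v0y), the top-right CPMV v1 = (v1x, v1y), and the
    block width w."""
    v0x, v0y = v0
    v1x, v1y = v1
    vx = (v1x - v0x) / w * x - (v1y - v0y) / w * y + v0x
    vy = (v1y - v0y) / w * x + (v1x - v0x) / w * y + v0y
    return vx, vy
```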

According to at least one embodiment, motion information associated with the predictor candidate is derived from:

-   a bilateral template matching between two reference blocks in respectively two reference frames,
-   a reference block of a reference frame identified by motion information of a first spatial neighboring block of the predictor candidate,
-   an average of motion vectors of spatial and temporal neighboring blocks of the predictor candidate.

According to another general aspect of at least one embodiment, a non-transitory computer readable medium is presented containing data content generated according to the method or the apparatus of any of the preceding descriptions.

According to another general aspect of at least one embodiment, a signal is provided comprising video data generated according to the method or the apparatus of any of the preceding descriptions.

One or more of the present embodiments also provide a computer readable storage medium having stored thereon instructions for encoding or decoding video data according to any of the methods described above. The present embodiments also provide a computer readable storage medium having stored thereon a bitstream generated according to the methods described above. The present embodiments also provide a method and apparatus for transmitting the bitstream generated according to the methods described above. The present embodiments also provide a computer program product including instructions for performing any of the methods described.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a block diagram of an embodiment of an HEVC (High Efficiency Video Coding) video encoder.

FIG. 2A is a pictorial example depicting HEVC reference sample generation.

FIG. 2B is a pictorial example depicting motion vector prediction in HEVC.

FIG. 3 illustrates a block diagram of an embodiment of an HEVC video decoder.

FIG. 4 illustrates an example of Coding Tree Unit (CTU) and Coding Tree (CT) concepts to represent a compressed HEVC picture.

FIG. 5 illustrates an example of divisions of a Coding Tree Unit (CTU) into Coding Units (CUs), Prediction Units (PUs), and Transform Units (TUs).

FIG. 6 illustrates an example of an affine model as the motion model used in the Joint Exploration Model (JEM).

FIG. 7 illustrates an example of a 4×4 sub-CU based affine motion vector field used in the Joint Exploration Model (JEM).

FIG. 8A illustrates an example of motion vector prediction candidates for Affine Inter CUs.

FIG. 8B illustrates an example of motion vector prediction candidates in the Affine Merge mode.

FIG. 9 illustrates an example of spatial derivation of affine control point motion vectors in the case of the Affine Merge mode motion model.

FIG. 10 illustrates an example method according to a general aspect of at least one embodiment.

FIG. 11 illustrates another example method according to a general aspect of at least one embodiment.

FIG. 12 also illustrates another example method according to a general aspect of at least one embodiment.

FIG. 13 also illustrates another example method according to a general aspect of at least one embodiment.

FIG. 14 illustrates an example of a process/syntax for evaluating the Affine Merge mode of an inter-CU according to a general aspect of at least one embodiment.

FIG. 15 illustrates an example of a process/syntax for determining a predictor candidate in an Affine Merge mode according to a general aspect of at least one embodiment.

FIG. 16 illustrates an example of a process/syntax for determining one or more corresponding control point generator motion vectors for a predictor candidate having a non-affine motion model.

FIG. 17A illustrates an example of a process/syntax for determining one or more corresponding control point motion vectors for a predictor candidate having a non-affine motion model.

FIG. 17B illustrates another example of a process/syntax for determining one or more corresponding control point motion vectors for a predictor candidate having a non-affine motion model.

FIG. 18 illustrates an example of a predictor candidate selection process/syntax according to a general aspect of at least one embodiment.

FIG. 19 illustrates an example of a process/syntax to build a set of multiple predictor candidates according to a general aspect of at least one embodiment.

FIG. 20 illustrates an example of a derivation process/syntax of top-left and top-right corner CPMVs for each predictor candidate according to a general aspect of at least one embodiment.

FIG. 21 illustrates an example of a CU coded using a bilateral template matching between two reference blocks in respectively two reference frames.

FIG. 22 illustrates an example of a CU coded using a bilateral template matching divided into sub-blocks.

FIG. 23 illustrates an example of a CU coded in an ATMVP mode of the JEM.

FIG. 24 illustrates an example of a CU coded in an STMVP mode of the JEM.

FIG. 25 illustrates a block diagram of an example apparatus in which various aspects of the embodiments may be implemented.

DETAILED DESCRIPTION

FIG. 1 illustrates an exemplary High Efficiency Video Coding (HEVC) encoder 100. HEVC is a compression standard developed by the Joint Collaborative Team on Video Coding (JCT-VC) (see, e.g., “ITU-T H.265 TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (October 2014), SERIES H: AUDIOVISUAL AND MULTIMEDIA SYSTEMS, Infrastructure of audiovisual services—Coding of moving video, High efficiency video coding, Recommendation ITU-T H.265”).

In HEVC, to encode a video sequence with one or more pictures, a picture is partitioned into one or more slices where each slice can include one or more slice segments. A slice segment is organized into coding units, prediction units, and transform units.

In the present application, the terms “reconstructed” and “decoded” may be used interchangeably, the terms “encoded” or “coded” may be used interchangeably, and the terms “picture” and “frame” may be used interchangeably. Usually, but not necessarily, the term “reconstructed” is used at the encoder side while “decoded” is used at the decoder side.

The HEVC specification distinguishes between “blocks” and “units,” where a “block” addresses a specific area in a sample array (e.g., luma, Y), and the “unit” includes the collocated blocks of all encoded color components (Y, Cb, Cr, or monochrome), syntax elements, and prediction data that are associated with the blocks (e.g., motion vectors).

For coding, a picture is partitioned into coding tree blocks (CTB) of square shape with a configurable size, and a consecutive set of coding tree blocks is grouped into a slice. A Coding Tree Unit (CTU) contains the CTBs of the encoded color components. A CTB is the root of a quadtree partitioning into Coding Blocks (CB), and a Coding Block may be partitioned into one or more Prediction Blocks (PB) and forms the root of a quadtree partitioning into Transform Blocks (TBs). Corresponding to the Coding Block, Prediction Block, and Transform Block, a Coding Unit (CU) includes the Prediction Units (PUs) and the tree-structured set of Transform Units (TUs), a PU includes the prediction information for all color components, and a TU includes residual coding syntax structure for each color component. The size of a CB, PB, and TB of the luma component applies to the corresponding CU, PU, and TU. In the present application, the term “block” can be used to refer, for example, to any of CTU, CU, PU, TU, CB, PB, and TB. In addition, the “block” can also be used to refer to a macroblock and a partition as specified in H.264/AVC or other video coding standards, and more generally to refer to an array of data of various sizes.

In the exemplary encoder 100, a picture is encoded by the encoder elements as described below. The picture to be encoded is processed in units of CUs. Each CU is encoded using either an intra or inter mode. When a CU is encoded in an intra mode, it performs intra prediction (160). In an inter mode, motion estimation (175) and compensation (170) are performed. The encoder decides (105) which one of the intra mode or inter mode to use for encoding the CU, and indicates the intra/inter decision by a prediction mode flag. Prediction residuals are calculated by subtracting (110) the predicted block from the original image block.

CUs in intra mode are predicted from reconstructed neighboring samples within the same slice. A set of 35 intra prediction modes is available in HEVC, including a DC, a planar, and 33 angular prediction modes. The intra prediction reference is reconstructed from the row and column adjacent to the current block. The reference extends over two times the block size in the horizontal and vertical directions using available samples from previously reconstructed blocks. When an angular prediction mode is used for intra prediction, reference samples can be copied along the direction indicated by the angular prediction mode.

The applicable luma intra prediction mode for the current block can be coded using two different options. If the applicable mode is included in a constructed list of three most probable modes (MPM), the mode is signaled by an index in the MPM list. Otherwise, the mode is signaled by a fixed-length binarization of the mode index. The three most probable modes are derived from the intra prediction modes of the top and left neighboring blocks.

For an inter CU, the corresponding coding block is further partitioned into one or more prediction blocks. Inter prediction is performed on the PB level, and the corresponding PU contains the information about how inter prediction is performed. The motion information (i.e., motion vector and reference picture index) can be signaled in two methods, namely, “merge mode” and “advanced motion vector prediction (AMVP)”.

In the merge mode, a video encoder or decoder assembles a candidate list based on already coded blocks, and the video encoder signals an index for one of the candidates in the candidate list. At the decoder side, the motion vector (MV) and the reference picture index are reconstructed based on the signaled candidate.

The set of possible candidates in the merge mode consists of spatial neighbor candidates, a temporal candidate, and generated candidates. FIG. 2A shows the positions of five spatial candidates {a₁, b₁, b₀, a₀, b₂} for a current block 210, wherein a₀ and a₁ are to the left of the current block, and b₁, b₀, b₂ are at the top of the current block. For each candidate position, the availability is checked according to the order of a₁, b₁, b₀, a₀, b₂, and then the redundancy in candidates is removed.

The motion vector of the collocated location in a reference picture can be used for derivation of a temporal candidate. The applicable reference picture is selected on a slice basis and indicated in the slice header, and the reference index for the temporal candidate is set to $i_{ref} = 0$. If the POC distance (td) between the picture of the collocated PU and the reference picture from which the collocated PU is predicted is the same as the distance (tb) between the current picture and the reference picture containing the collocated PU, the collocated motion vector $mv_{col}$ can be directly used as the temporal candidate. Otherwise, a scaled motion vector, $(tb/td) \cdot mv_{col}$, is used as the temporal candidate. Depending on where the current PU is located, the collocated PU is determined by the sample location at the bottom-right or at the center of the current PU.
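A minimal sketch of this scaling rule follows, assuming floating-point arithmetic rather than the fixed-point scaling and clipping actually specified in HEVC; the function name is hypothetical.

```python
# Hypothetical sketch; HEVC performs this with fixed-point arithmetic
# and clipping, both omitted here.

def temporal_mv_candidate(mv_col, tb, td):
    """Return the temporal candidate: mv_col directly when the two POC
    distances match, otherwise mv_col scaled by tb/td."""
    if tb == td:
        return mv_col
    return (mv_col[0] * tb / td, mv_col[1] * tb / td)
```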

The maximum number of merge candidates, N, is specified in the slice header. If the number of merge candidates is larger than N, only the first N−1 spatial candidates and the temporal candidate are used. Otherwise, if the number of merge candidates is less than N, the set of candidates is filled up to the maximum number N with generated candidates as combinations of already present candidates, or null candidates. The candidates used in the merge mode may be referred to as “merge candidates” in the present application.

If a CU indicates a skip mode, the applicable index for the merge candidate is indicated only if the list of merge candidates is larger than 1, and no further information is coded for the CU. In the skip mode, the motion vector is applied without a residual update.

In AMVP, a video encoder or decoder assembles candidate lists based on motion vectors determined from already coded blocks. The video encoder then signals an index in the candidate list to identify a motion vector predictor (MVP) and signals a motion vector difference (MVD). At the decoder side, the motion vector (MV) is reconstructed as MVP+MVD. The applicable reference picture index is also explicitly coded in the PU syntax for AMVP.

Only two spatial motion candidates are chosen in AMVP. The first spatial motion candidate is chosen from left positions {a₀, a₁} and the second one from the above positions {b₀, b₁, b₂}, while keeping the searching order as indicated in the two sets. If the number of motion vector candidates is not equal to two, the temporal MV candidate can be included. If the set of candidates is still not fully filled, then zero motion vectors are used.

If the reference picture index of a spatial candidate corresponds to the reference picture index for the current PU (i.e., using the same reference picture index or both using long-term reference pictures, independently of the reference picture list), the spatial candidate motion vector is used directly. Otherwise, if both reference pictures are short-term ones, the candidate motion vector is scaled according to the distance (tb) between the current picture and the reference picture of the current PU and the distance (td) between the current picture and the reference picture of the spatial candidate. The candidates used in the AMVP mode may be referred to as “AMVP candidates” in the present application.

For ease of notation, a block tested with the “merge” mode at the encoder side or a block decoded with the “merge” mode at the decoder side is denoted as a “merge” block, and a block tested with the AMVP mode at the encoder side or a block decoded with the AMVP mode at the decoder side is denoted as an “AMVP” block.

FIG. 2B illustrates an exemplary motion vector representation using AMVP. For a current block 240 to be encoded, a motion vector ($MV_{current}$) can be obtained through motion estimation. Using the motion vector ($MV_{left}$) from a left block 230 and the motion vector ($MV_{above}$) from the above block 220, a motion vector predictor can be chosen from $MV_{left}$ and $MV_{above}$ as $MVP_{current}$. A motion vector difference then can be calculated as $MVD_{current} = MV_{current} - MVP_{current}$.
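As a sketch of the two AMVP relations above (hypothetical helper names; a real codec would operate on quarter-pel integer motion vectors):

```python
# Hypothetical sketch of the AMVP encoder/decoder relations.

def mvd_from(mv_current, mvp_current):
    """Encoder side: MVD = MV - MVP, component-wise."""
    return (mv_current[0] - mvp_current[0],
            mv_current[1] - mvp_current[1])

def mv_from(mvp, mvd):
    """Decoder side: MV = MVP + MVD, component-wise."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```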

Motion compensation prediction can be performed using one or two reference pictures for prediction. In P slices, only a single prediction reference can be used for inter prediction, enabling uni-prediction for a prediction block. In B slices, two reference picture lists are available, and uni-prediction or bi-prediction can be used. In bi-prediction, one reference picture from each of the reference picture lists is used.

In HEVC, the precision of the motion information for motion compensation is one quarter-sample (also referred to as quarter-pel or ¼-pel) for the luma component and one eighth-sample (also referred to as ⅛-pel) for the chroma components for the 4:2:0 configuration. A 7-tap or 8-tap interpolation filter is used for interpolation of fractional-sample positions, i.e., ¼, ½ and ¾ of full sample locations in both horizontal and vertical directions can be addressed for luma.

The prediction residuals are then transformed (125) and quantized (130). The quantized transform coefficients, as well as motion vectors and other syntax elements, are entropy coded (145) to output a bitstream. The encoder may also skip the transform and apply quantization directly to the non-transformed residual signal on a 4×4 TU basis. The encoder may also bypass both transform and quantization, i.e., the residual is coded directly without the application of the transform or quantization process. In direct PCM coding, no prediction is applied and the coding unit samples are directly coded into the bitstream.

The encoder decodes an encoded block to provide a reference for further predictions. The quantized transform coefficients are de-quantized (140) and inverse transformed (150) to decode prediction residuals. Combining (155) the decoded prediction residuals and the predicted block, an image block is reconstructed. In-loop filters (165) are applied to the reconstructed picture, for example, to perform deblocking/SAO (Sample Adaptive Offset) filtering to reduce encoding artifacts. The filtered image is stored at a reference picture buffer (180).

FIG. 3 illustrates a block diagram of an exemplary HEVC video decoder 300. In the exemplary decoder 300, a bitstream is decoded by the decoder elements as described below. Video decoder 300 generally performs a decoding pass reciprocal to the encoding pass described in FIG. 1, which itself performs video decoding as part of encoding video data.

In particular, the input of the decoder includes a video bitstream, which may be generated by video encoder 100. The bitstream is first entropy decoded (330) to obtain transform coefficients, motion vectors, and other coded information. The transform coefficients are de-quantized (340) and inverse transformed (350) to decode the prediction residuals. Combining (355) the decoded prediction residuals and the predicted block, an image block is reconstructed. The predicted block may be obtained (370) from intra prediction (360) or motion-compensated prediction (i.e., inter prediction) (375). As described above, AMVP and merge mode techniques may be used to derive motion vectors for motion compensation, which may use interpolation filters to calculate interpolated values for sub-integer samples of a reference block. In-loop filters (365) are applied to the reconstructed image. The filtered image is stored at a reference picture buffer (380).

As mentioned, in HEVC, motion compensated temporal prediction is employed to exploit the redundancy that exists between successive pictures of a video. To do that, a motion vector is associated with each prediction unit (PU). As explained above, each CTU is represented by a Coding Tree in the compressed domain. This is a quad-tree division of the CTU, where each leaf is called a Coding Unit (CU) and is also illustrated in FIG. 4 for CTUs 410 and 420. Each CU is then given some Intra or Inter prediction parameters as prediction information. To do so, a CU may be spatially partitioned into one or more Prediction Units (PUs), each PU being assigned some prediction information. The Intra or Inter coding mode is assigned on the CU level. These concepts are further illustrated in FIG. 5 for an exemplary CTU 500 and a CU 510.

In HEVC, one motion vector is assigned to each PU. This motion vector is used for motion compensated temporal prediction of the considered PU. Therefore, in HEVC, the motion model that links a predicted block and its reference block simply consists of a translation or calculation based on the reference block and the corresponding motion vector.

To make improvements to HEVC, the reference software and/or documentation JEM (Joint Exploration Model) is being developed by the Joint Video Exploration Team (JVET). In one JEM version (e.g., “Algorithm Description of Joint Exploration Test Model 5”, Document JVET-E1001 v2, Joint Video Exploration Team of ISO/IEC JTC1/SC29/WG11, 5th meeting, 12-20 Jan. 2017, Geneva, CH), some further motion models are supported to improve temporal prediction. To do so, a PU can be spatially divided into sub-PUs and a model can be used to assign each sub-PU a dedicated motion vector.

In more recent versions of the JEM (e.g., “Algorithm Description of Joint Exploration Test Model 2”, Document JVET-B1001 v3, Joint Video Exploration Team of ISO/IEC JTC1/SC29/WG11, 2nd meeting, 20-26 Feb. 2016, San Diego, USA), a CU is no longer specified to be divided into PUs or TUs. Instead, more flexible CU sizes may be used, and some motion data are directly assigned to each CU. In this new codec design under the newer versions of JEM, a CU may be divided into sub-CUs and a motion vector may be computed for each sub-CU of the divided CU.

One of the new motion models introduced in the JEM is the use of an affine model as the motion model to represent the motion vectors in a CU. The motion model used is illustrated by FIG. 6 and is represented by Equation 1 as shown below. The affine motion field comprises the following motion vector component values for each position (x, y) inside the considered block 600 of FIG. 6:

$$\begin{cases} v_x = \dfrac{v_{1x} - v_{0x}}{w}x - \dfrac{v_{1y} - v_{0y}}{w}y + v_{0x} \\[1ex] v_y = \dfrac{v_{1y} - v_{0y}}{w}x + \dfrac{v_{1x} - v_{0x}}{w}y + v_{0y} \end{cases} \quad \text{(Equation 1)}$$

(affine motion model used to generate the motion field inside a CU for prediction)

wherein $(v_{0x}, v_{0y})$ and $(v_{1x}, v_{1y})$ are the control point motion vectors used to generate the corresponding motion field, $(v_{0x}, v_{0y})$ corresponds to the control point motion vector of the top-left corner of the block being encoded or decoded, $(v_{1x}, v_{1y})$ corresponds to the control point motion vector of the top-right corner of the block being encoded or decoded, and $w$ is the width of the block being encoded or decoded.

To reduce complexity, a motion vector is computed for each 4×4 sub-block (sub-CU) of the considered CU 700, as illustrated in FIG. 7. An affine motion vector is computed from the control point motion vectors, for each center position of each sub-block. The obtained MV is represented at 1/16-pel accuracy. As a result, the compensation of a coding unit in the affine mode consists in motion compensated prediction of each sub-block with its own motion vector. These motion vectors for the sub-blocks are shown respectively as an arrow for each of the sub-blocks in FIG. 7.
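A hypothetical Python sketch of this per-sub-block computation follows, reusing the affine_mv() helper sketched earlier; the 1/16-pel rounding of the JEM and all fixed-point details are omitted.

```python
# Hypothetical sketch: one motion vector per 4x4 sub-block, evaluated
# at each sub-block center; (x, y) coordinates are relative to the
# CU's top-left corner.

def sub_block_motion_field(v0, v1, cu_w, cu_h, sub=4):
    field = {}
    for y in range(0, cu_h, sub):
        for x in range(0, cu_w, sub):
            cx, cy = x + sub / 2.0, y + sub / 2.0  # sub-block center
            field[(x, y)] = affine_mv(v0, v1, cu_w, cx, cy)
    return field
```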

Affine motion compensation may be used in 2 ways in the JEM: Affine Inter (AF_AMVP) mode and Affine Merge mode. They are introduced in the following sections.

Affine Inter (AF_AMVP) mode: A CU in AMVP mode, whose size is larger than 8×8, may be predicted in Affine Inter mode. This is signaled through a flag in the bit-stream. The generation of the Affine Motion Field for that CU includes determining control point motion vectors (CPMVs), which are obtained by the decoder through the addition of a motion vector differential and a control point motion vector prediction (CPMVP). The CPMVPs are a pair of motion vector candidates, respectively taken from the sets (A, B, C) and (D, E) illustrated in FIG. 8A for a current CU 800 being encoded or decoded.

Affine Merge mode: In Affine Merge mode, a CU-level flag indicates if a merge CU employs affine motion compensation. If so, then the first available neighboring CU that has been coded in an Affine mode is selected among the ordered set of candidate positions A, B, C, D, E of FIG. 8B for a current CU 880 being encoded or decoded. Note that this ordered set of candidate positions in JEM is the same as the spatial neighbor candidates in the merge mode in HEVC as shown in FIG. 2A and as explained previously.

Once the first neighboring CU in Affine mode is obtained, then the 3 CPMVs $\overrightarrow{v_2}$, $\overrightarrow{v_3}$, and $\overrightarrow{v_4}$ from the top-left, top-right and bottom-left corners of the neighboring affine CU are retrieved or calculated. For example, FIG. 9 shows this first determined neighboring CU 910 in Affine mode being in the A position of FIG. 8B for a current CU 900 being encoded or decoded. Based on these three CPMVs of the neighboring CU 910, the two CPMVs of the top-left and top-right corners of the current CU 900 are derived as follows:

$$\begin{aligned} \overrightarrow{v_0} &= \overrightarrow{v_2} + \left(\overrightarrow{v_4} - \overrightarrow{v_2}\right)\left(\frac{Y_{curr} - Y_{neighb}}{H_{neighb}}\right) + \left(\overrightarrow{v_3} - \overrightarrow{v_2}\right)\left(\frac{X_{curr} - X_{neighb}}{W_{neighb}}\right) \\ \overrightarrow{v_1} &= \overrightarrow{v_0} + \left(\overrightarrow{v_3} - \overrightarrow{v_2}\right)\left(\frac{W_{curr}}{W_{neighb}}\right) \end{aligned} \quad \text{(Equation 2)}$$

(derivation of the CPMVs of the current CU based on the three control-point motion vectors of the selected neighboring CU)

where $Y_{curr}$, $Y_{neighb}$ are respectively the vertical position of the current CU and the selected neighboring CU in the picture, $X_{curr}$, $X_{neighb}$ are respectively the horizontal position of the current CU and the selected neighboring CU in the picture, $W_{curr}$ is the horizontal size of the current CU, and $W_{neighb}$, $H_{neighb}$ are respectively the horizontal and vertical size of the selected neighboring CU.

When the control point motion vectors $\overrightarrow{v_0}$ and $\overrightarrow{v_1}$ of the current CU are obtained, the motion field inside the current CU being encoded or decoded is computed on a 4×4 sub-CU basis, through the model of Equation 1 as described above in connection with FIG. 6.

Accordingly, a general aspect of at least one embodiment aims to improve the performance of the Affine Merge mode in JEM so that the compression performance of a considered video codec may be improved. Therefore, in at least one embodiment, an augmented and improved affine motion compensation apparatus and method are presented, for example, for Coding Units that are coded in Affine Merge mode. The proposed augmented and improved affine mode includes evaluating predictor candidates which are not coded using an Affine Merge mode or an Affine Inter mode.

As discussed before, in the current JEM, the first neighboring CU coded in Affine Merge mode among the surrounding CUs is selected to predict the affine motion model associated with the current CU being encoded or decoded. That is, the first neighboring CU candidate among the ordered set (A, B, C, D, E) of FIG. 8B that is coded in affine mode is selected to predict the affine motion model of the current CU. In the case where none of the neighboring CU candidates is coded in affine mode, no prediction is available for the affine motion model of the current CU.

Accordingly, in at least one embodiment, a predictor candidate is determined for coding a current CU in Affine Merge mode, such a predictor candidate being associated with motion information corresponding to a non-affine motion model. For instance, such a predictor candidate may correspond to a previously coded CU in a non-affine mode, i.e., a translational model as known from HEVC.

Accordingly, in at least one embodiment, the predictor candidate is selected from a set of predictor candidates. The set of predictor candidates may comprise CUs previously coded using an affine mode and CUs previously coded using a non-affine mode. The predictor candidate from the set of predictor candidates that provides the best coding efficiency when coding the current CU in Affine Merge mode is selected. The improvements of this embodiment, at a general level, therefore comprise, for example:

-   constructing a set of multiple predictor candidates that is likely to provide a good set of candidates for the prediction of an affine motion model of a CU (for encoder/decoder);
-   selecting one predictor for the current CU's control point motion vector among the constructed set (for encoder/decoder); and/or
-   signaling/decoding the index of the current CU's control point motion vector predictor (for encoder/decoder).

Accordingly, FIG. 10 illustrates an exemplary encoding method 1000 according to a general aspect of at least one embodiment. At 1010, the method 1000 determines, for a block being encoded in a picture, a predictor candidate. The predictor candidate is associated with motion information. That is, the predictor candidate has been previously coded in an INTER mode by any method based on motion compensation prediction using the associated motion information. According to at least one embodiment, the predictor candidate verifies a predetermined criterion, for example the predictor candidate is associated with motion information that is close to an affine motion model.

At 1020, the method 1000 determines, for the predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated with the predictor candidate. Further details for such determining are given below with respect to FIG. 16. At 1030, the method 1000 determines, for the block being encoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the predictor candidate. At 1040, the method 1000 determines, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being encoded. At 1050, the method 1000 encodes the block based on the corresponding motion field.

FIG. 11 illustrates an exemplary encoding method 1100 according to a general aspect of at least one embodiment. At 1110, the method 1100 determines, for a block being encoded in a picture, a set of predictor candidates. According to at least one embodiment, at least one predictor candidate from the set of predictor candidates verifies a predetermined criterion, for example the at least one predictor candidate is associated with motion information that is close to an affine motion model.

At 1120, the method 1100 selects a predictor candidate among the set of predictor candidates. At 1130, the method 1100 determines, for the selected predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated with the selected predictor candidate. Further details for such determining are given below with respect to FIG. 16. At 1140, the method 1100 determines, for the block being encoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the selected predictor candidate. At 1150, the method 1100 determines, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being encoded. At 1160, the method 1100 evaluates the selected predictor candidate according to one or more criteria and based on the corresponding motion field. For example, the method 1100 estimates a rate-distortion cost for encoding the block using the motion field determined at 1150 and stores the rate-distortion cost in association with the selected predictor candidate. At 1160, if all the predictor candidates of the set of predictor candidates have been evaluated, the method 1100 passes to 1170. If one or more of the predictor candidates of the set of predictor candidates have not been evaluated, the method 1100 passes to 1120 to select a new predictor candidate from the set of predictor candidates. At 1170, the method 1100 selects a predictor candidate from the set of predictor candidates based on the evaluating. For example, the predictor candidate that provides the lowest rate-distortion cost for the block being encoded is selected. At 1180, the method 1100 encodes the block based on the predictor candidate selected at 1170. At 1190, the method 1100 encodes an index of the predictor candidate selected at 1170. This index is used by the decoder to retrieve the predictor candidate from the set of predictor candidates.
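The candidate-selection loop of method 1100 can be sketched as follows; this is illustrative only, with the per-candidate cost evaluation (steps 1130 to 1160) abstracted behind a caller-supplied function rather than implemented.

```python
# Hypothetical sketch of steps 1120-1170 of method 1100.

def select_best_candidate(candidates, rd_cost_of):
    """Return (index, cost) of the predictor candidate with the lowest
    rate-distortion cost; rd_cost_of(candidate) is assumed to perform
    steps 1130-1160 (CPMV derivation, motion field, RD evaluation)."""
    best_idx, best_cost = None, float("inf")
    for idx, cand in enumerate(candidates):
        cost = rd_cost_of(cand)
        if cost < best_cost:
            best_idx, best_cost = idx, cost
    return best_idx, best_cost  # the index is then encoded (step 1190)
```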

FIG. 12 illustrates an exemplary decoding method 1200 according to a general aspect of at least one embodiment. At 1210, the method 1200 determines, for a block being decoded in a picture, a predictor candidate. The predictor candidate is associated with motion information. That is, the predictor candidate has been previously decoded and reconstructed by any method based on motion compensation prediction using the associated motion information. At 1220, the method 1200 determines, for the predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated with the predictor candidate. Further details for such determining are given below with respect to FIG. 16. At 1230, the method 1200 determines, for the block being decoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the predictor candidate. At 1240, the method 1200 determines, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being decoded. At 1250, the method 1200 decodes the block based on the corresponding motion field.

FIG. 13 illustrates an exemplary decoding method 1300 according to a general aspect of at least one embodiment. At 1310, the method 1300 determines, for a block being decoded in a picture, a set of predictor candidates. At 1320, the method 1300 receives, for a block being decoded in a picture, an index corresponding to a particular predictor candidate in the set of predictor candidates. In various embodiments, the particular predictor candidate has been selected at an encoder, and the index allows one of multiple predictor candidates to be selected. The method 1300 selects a predictor candidate in the set of predictor candidates using the received index. At 1330, the method 1300 determines, for the selected predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated with the selected predictor candidate. Further details for such determining are given below with respect to FIG. 16. At 1340, the method 1300 determines, for the block being decoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the selected predictor candidate. At 1350, the method 1300 determines, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being decoded. At 1360, the method 1300 decodes the block based on the corresponding motion field.

FIG. 14 illustrates the detail of an embodiment of a process/syntax 1400 used to predict the affine motion field of a current CU being encoded or decoded in the existing Affine Merge mode in JEM. The input 1401 to this process/syntax 1400 is the current Coding Unit for which one wants to generate the affine motion field of the sub-blocks as shown in FIG. 7. At 1410, the Affine Merge CPMVs for the current block are obtained with the selected predictor candidate as explained above in connection with, e.g., FIG. 6, FIG. 7, FIG. 8B, and FIG. 9. The derivation of this predictor candidate is also explained in more detail later with respect to FIG. 15 according to at least one embodiment.

As a result, at 1420, the top-left and top-right control point motion vectors $\overrightarrow{v_0}$ and $\overrightarrow{v_1}$ are then used to compute the affine motion field associated with the current CU. This consists in computing a motion vector for each 4×4 sub-block according to Equation 1 as explained before. At 1430 and 1440, once the motion field is obtained for the current CU, the temporal prediction of the current CU takes place, involving 4×4 sub-block based motion compensation and then OBMC (Overlapped Block Motion Compensation). At 1450 and 1460, the current CU is coded and reconstructed, successively with and without residual data. For example, at 1450, the current CU is first coded using intra mode with no residual coding. At 1460, the best way to encode the current CU (e.g., the way having minimum rate distortion cost) is then selected, which provides the coding of the current CU in the Affine Merge mode. The Affine Merge coding mode is then put in a rate distortion (RD) competition with other coding modes (including, e.g., inter mode with residual coding) available for the current CU in the considered video coding system. A mode is selected based on the RD competition, and that mode is used to encode the current CU, and an index for that mode is also encoded in various embodiments.

In at least one implementation, a residual flag is used. At 1450, a flag is activated indicating that the coding is done with residual data. At 1460, the current CU is fully coded and reconstructed (with residual), giving the corresponding RD cost. Then the flag is deactivated, indicating that the coding is done without residual data, and the process goes back to 1460 where the CU is coded (without residual), giving the corresponding RD cost. The lowest RD cost between the two previous ones indicates if residual must be coded or not (normal or skip). Then this best RD cost is put in competition with other coding modes. Rate distortion determination will be explained in more detail below.

FIG. 15 shows the detail of an embodiment of a process/syntax 1500 used to predict the one or more control points of the current CU's affine motion field. This consists in searching for a CU, among the spatial positions (A, B, C, D, E) of FIG. 8B, that is suitable for deriving for the current CU one or more control points of an affine motion model. Such a suitable CU may verify a predetermined criterion, for example such a CU has been previously encoded using motion information that is close to an affine motion model. Examples of such coding modes are given later with respect to FIG. 20A-B, 21-23.

The spatial positions (A, B, C, D, E) of FIG. 8B are evaluated in sequential order, and the first position that corresponds to a CU verifying the predetermined criterion is selected. The process/syntax 1500 then consists in computing control point motion vectors for the current CU that will be used later to generate the affine motion field assigned to the current CU to encode. This control point computation proceeds as follows. The CU that contains the selected position is determined. It is one of the neighbor CUs of the current CU as explained before. Next, the 3 CPMVs $\overrightarrow{v_2}$, $\overrightarrow{v_3}$, and $\overrightarrow{v_4}$ from the top-left, top-right and bottom-left corners inside the selected neighbor CU are retrieved or determined. For simplicity, here, the 3 CPMVs $\overrightarrow{v_2}$, $\overrightarrow{v_3}$, and $\overrightarrow{v_4}$ are called control point generator motion vectors. If the control point generator motion vectors have not yet been determined, and if the neighbor CU is not in an affine mode, the control point generator motion vectors for the neighbor CU are determined as explained in connection with FIG. 16. If the control point generator motion vectors have already been determined for the selected neighbor CU, the 3 CPMVs $\overrightarrow{v_2}$, $\overrightarrow{v_3}$, and $\overrightarrow{v_4}$ are retrieved. If the selected neighbor CU is in an affine mode, the control point generator motion vectors (CPMVs $\overrightarrow{v_2}$, $\overrightarrow{v_3}$, and $\overrightarrow{v_4}$) are determined from the top-left and top-right CPMVs $\overrightarrow{v_0}$ and $\overrightarrow{v_1}$ of the selected neighbor CU using Equation 1. Finally, the top-left and top-right CPMVs $\overrightarrow{v_0}$ and $\overrightarrow{v_1}$ of the current CU are derived from the 3 CPMVs $\overrightarrow{v_2}$, $\overrightarrow{v_3}$, and $\overrightarrow{v_4}$, according to Equation 2, as explained before in connection with FIG. 9.
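The scan over the candidate positions can be sketched as below; the criterion test and the CU representation are hypothetical stand-ins for what the embodiment leaves open.

```python
# Hypothetical sketch of the first stage of process/syntax 1500.

def first_suitable_neighbor(neighbors, satisfies_criterion):
    """neighbors maps each position 'A'..'E' to a CU (or None); return
    the first CU, in that order, meeting the predetermined criterion."""
    for pos in ("A", "B", "C", "D", "E"):
        cu = neighbors.get(pos)
        if cu is not None and satisfies_criterion(cu):
            return cu
    return None  # no suitable neighbor found by the scan
```

The control point generator motion vectors of the CU returned by such a scan would then feed the Equation 2 derivation sketched earlier.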

According to at least one embodiment, FIG. 16 shows a method 1600 for determining, for a predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated with the predictor candidate. At 1610, the method 1600 determines one or more corresponding control points associated with the predictor candidate, based on motion information associated with the predictor candidate. Further details are given below with respect to FIG. 17. At 1620, the method 1600 determines the one or more corresponding control point generator motion vectors from the one or more corresponding control points associated with the predictor candidate.

According to an embodiment, the one or more corresponding control point generator motion vectors of the predictor candidate comprise a motion vector $\overrightarrow{v_2}$ of a top-left corner of the predictor candidate, a motion vector $\overrightarrow{v_3}$ of an above-right corner of the predictor candidate, and a motion vector $\overrightarrow{v_4}$ of a left-bottom corner of the predictor candidate. The CPMVs $\overrightarrow{v_0}$ and $\overrightarrow{v_1}$ of the current CU to encode are determined using Equation 2.

According to at least one embodiment, FIG. 17A shows a method 1700 for determining one or more corresponding control points associated with a predictor candidate, based on motion information associated with the predictor candidate. According to an embodiment, the predictor candidate comprises one or more sub-blocks, each sub-block being associated with at least one motion vector. At 1710, the method 1700 determines one or more corresponding control point motion vectors associated with the predictor candidate, based on at least two motion vectors associated respectively with at least two sub-blocks of the predictor candidate. For example, if we denote $\overrightarrow{v_{s0}}$ a motion vector of a first sub-block of the predictor candidate, $\overrightarrow{v_{sw}}$ a motion vector of the last sub-block of the first line of the predictor candidate, $\overrightarrow{v_{sh}}$ a motion vector of the last sub-block of the first column of the predictor candidate, and $\overrightarrow{v_{swh}}$ a motion vector of the sub-block of the last line and last column of the predictor candidate, the one or more corresponding control point motion vectors for the predictor candidate may be set to $\overrightarrow{v_{s0}}$ for the top-left corner point motion vector of the predictor candidate ($\overrightarrow{v_0}$) and to $\overrightarrow{v_{sw}}$ for the top-right corner point motion vector of the predictor candidate ($\overrightarrow{v_1}$). At 1720, the method 1700 verifies that the one or more corresponding control point motion vectors associated with the predictor candidate satisfy an affine motion model. For example, the method 1700 estimates the motion vectors $\overrightarrow{v_{sh}}'$ and $\overrightarrow{v_{swh}}'$ using Equation 1 with $\overrightarrow{v_{s0}}$ and $\overrightarrow{v_{sw}}$ as CPMVs of the predictor candidate. At 1720, the method 1700 then compares the estimated motion vectors $\overrightarrow{v_{sh}}'$ and $\overrightarrow{v_{swh}}'$ with the motion vectors $\overrightarrow{v_{sh}}$ and $\overrightarrow{v_{swh}}$ associated with the corresponding sub-blocks. If the respective motion vectors are close in angles and norm values, the one or more corresponding control point motion vectors determined at 1710 are close to an affine motion model, and the predictor candidate associated with the one or more corresponding control point motion vectors determined at 1710 can be used as a predictor candidate for a block coded in an affine mode.

The respective motion vectors are close if the absolute difference between their norms is below a threshold and the angle between the two motion vectors is below another threshold. The thresholds may be fixed values, for example one pixel for the norm or 45° for the angle, or a value set according to the motion vector precision, for example 4 times the precision of the vector, or a value set according to the size of the motion vector.
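As a concrete illustration of the verification at 1720, the sketch below assumes that Equation 1 is the 4-parameter affine model of the JEM Affine mode, with CPMVs {right arrow over (v₀)} at the top-left and {right arrow over (v₁)} at the top-right of a block of width w; the helper names and default thresholds are assumptions of the example.

```python
import math

# Hypothetical sketch of the affine-consistency check at 1720, assuming
# Equation 1 is the 4-parameter affine model: the motion vector at
# position (x, y) is derived from the CPMVs v0 (top-left) and v1
# (top-right) of a block of width w.

def affine_mv(v0, v1, w, x, y):
    """Motion vector predicted at (x, y) by the affine model."""
    ax = (v1[0] - v0[0]) / w
    ay = (v1[1] - v0[1]) / w
    return (v0[0] + ax * x - ay * y,
            v0[1] + ay * x + ax * y)

def is_close(v, v_est, norm_thresh=1.0, angle_thresh=math.radians(45)):
    """True if the two vectors are close in norm and in angle."""
    n, n_est = math.hypot(*v), math.hypot(*v_est)
    if abs(n - n_est) > norm_thresh:
        return False
    if n == 0 or n_est == 0:              # degenerate case: norms suffice
        return True
    cos_a = (v[0] * v_est[0] + v[1] * v_est[1]) / (n * n_est)
    return math.acos(max(-1.0, min(1.0, cos_a))) <= angle_thresh
```

Under these assumptions, {right arrow over (v_(sh))}′ and {right arrow over (v_(swh))}′ would be obtained as affine_mv(v_s0, v_sw, w, 0, h) and affine_mv(v_s0, v_sw, w, w, h), each then compared with the stored sub-block vector using is_close.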

According to at least one embodiment, FIG. 17B shows a method 1700′ for determining one or more corresponding control points associated to a predictor candidate, based on motion information associated to the predictor candidate. According to an embodiment, the predictor candidate comprises one or more sub-blocks, each sub-block being associated to at least one motion vector. At 1710′, the method 1700′ determines, for at least two distinct sets of at least three sub-blocks of the predictor candidate, one or more corresponding control point motion vectors for the predictor candidate associated respectively to the at least two sets, based on the motion vectors associated respectively to the at least three sub-blocks of each set. For example, as illustrated on FIG. 22, a CU (Cur block) comprises four sub-blocks s₀, s_(w), s_(h) and s_(wh), each sub-block being respectively associated to a motion vector {right arrow over (v_(s0))}, {right arrow over (v_(sw))}, {right arrow over (v_(sh))} and {right arrow over (v_(swh))}. Multiple distinct sets of sub-blocks may be defined as (s₀, s_(w), s_(h)), (s₀, s_(w), s_(wh)), (s₀, s_(h), s_(wh)), and (s_(w), s_(h), s_(wh)). For each set of sub-blocks, one or more corresponding control point motion vectors are estimated for the predictor candidate using Equation 1. That is, at 1710′, the parameters v_(0x), v_(0y), v_(1x) and v_(1y) of Equation 1 are determined for each set using the motion vectors associated respectively to the sub-blocks of the set. Multiple sets of parameters {v_(0x), v_(0y), v_(1x), v_(1y)} are thus obtained. At 1720′, the method 1700′ calculates one or more corresponding control point motion vectors associated to the predictor candidate by averaging the one or more corresponding control point motion vectors associated to each set determined at 1710′. That is, the parameters v_(0x), v_(0y), v_(1x) and v_(1y) for the predictor candidate are obtained by averaging the parameters v_(0x), v_(0y), v_(1x) and v_(1y) obtained from each set of sub-blocks. According to another variant, the parameters v_(0x), v_(0y), v_(1x) and v_(1y) for the predictor candidate may be obtained as the median of the parameters v_(0x), v_(0y), v_(1x) and v_(1y) from each set of sub-blocks.
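The description does not prescribe how the parameters of Equation 1 are estimated from each set of three sub-blocks; a least-squares fit is one plausible realization, and it is what the sketch below uses. The helper names, the sub-block positions passed in, and the use of numpy are all assumptions of the example.

```python
import numpy as np
from statistics import mean, median

# Hypothetical sketch of 1710'/1720': fit the 4-parameter affine model
#   vx = v0x + (v1x - v0x) * x / w - (v1y - v0y) * y / w
#   vy = v0y + (v1y - v0y) * x / w + (v1x - v0x) * y / w
# to each set of three sub-blocks by least squares, then combine the
# per-set parameters by averaging (or by a component-wise median).

def solve_params(positions, mvs, w):
    """positions: (x, y) of each sub-block; mvs: its (vx, vy) vector."""
    A, b = [], []
    for (x, y), (vx, vy) in zip(positions, mvs):
        A.append([1 - x / w, y / w, x / w, -y / w]); b.append(vx)
        A.append([-y / w, 1 - x / w, y / w, x / w]); b.append(vy)
    params, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return tuple(params)                  # (v0x, v0y, v1x, v1y)

def combine_parameter_sets(param_sets, use_median=False):
    """Average (or median) the per-set parameters, component-wise."""
    reduce = median if use_median else mean
    return tuple(reduce(p[i] for p in param_sets) for i in range(4))
```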

One general aspect of at least one embodiment consists in selecting a better motion predictor candidate, among a set of multiple predictor candidates, to derive the CPMVs of a current CU being encoded or decoded. On the encoder side, the candidate used to predict the current CPMVs is chosen according to a rate distortion cost criterion, according to one aspect of one exemplary embodiment. Its index is then coded in the output bit-stream for the decoder, according to another aspect of another exemplary embodiment. The decoder then receives and decodes the index corresponding to the selected candidate from the bit-stream to derive the corresponding relevant data.

According to another aspect of another exemplary embodiment, the CPMVs used herewith are not limited to the two at the top-right and top-left positions of the current CU being coded or decoded, as shown in FIG. 6. Other embodiments comprise, e.g., only one vector or more than two vectors, and the positions of these CPMVs are, e.g., at other corner positions, or at any positions in or out of the current block, as long as it is possible to derive a motion field, such as, e.g., at the position(s) of the center of the corner 4×4 sub-blocks, or the internal corner of the corner 4×4 sub-blocks.

In an exemplary embodiment, the set of potential candidate predictors being investigated is identical to the set of positions (A, B, C, D, E) used to retrieve the CPMV predictor in the existing Affine Merge mode in JEM, as illustrated in FIG. 8B. FIG. 18 illustrates the details of one exemplary selection process/syntax 1800 for selecting the best candidate to predict a current CU's affine motion model according to a general aspect of this embodiment. However, other embodiments use a set of predictor positions that is different from A, B, C, D, E, and that can include fewer or more elements.

As shown at 1801, the input to this exemplary embodiment 1800 is also information of the current CU being encoded or decoded. At 1810, a set of multiple predictor candidates is built, according to the algorithm 1900 of FIG. 19, which is explained below. Algorithm 1900 of FIG. 19 includes gathering all neighboring positions (A, B, C, D, E) shown in FIG. 8A that correspond to a past CU satisfying the predetermined criterion explained with FIG. 15, into a set of candidates for the prediction of the current CU's affine motion. Thus, instead of stopping when a past CU satisfying the predetermined criterion is found as in FIG. 15, the process/syntax 1800 stores all possible candidates for the current CU.
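For illustration, a minimal sketch of that gathering step is given below; satisfies_criterion() (the predetermined criterion of FIG. 15) and cu_at() (the CU covering a given neighboring position) are hypothetical helpers, not functions of the JEM.

```python
# Hypothetical sketch of the gathering of algorithm 1900 (FIG. 19):
# keep every neighboring position whose containing CU satisfies the
# predetermined criterion, instead of stopping at the first match.

def build_candidate_set(neighbor_positions, cu_at, satisfies_criterion):
    """neighbor_positions: the spatial positions A, B, C, D, E of FIG. 8A."""
    candidates = []
    for pos in neighbor_positions:
        cu = cu_at(pos)                        # CU containing this position
        if cu is not None and satisfies_criterion(cu):
            candidates.append(cu)              # keep all matches
    return candidates
```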

Once the process of FIG. 19 is done, as shown at 1810 of FIG. 18, the process/syntax 1800 of FIG. 18, at 1820, computes the top-left and top-right corner CPMVs predicted from each candidate of the set provided at 1810. This process of 1820 is further detailed and illustrated by FIG. 20.

Again, FIG. 20 shows the detail of 1820 in FIG. 18 and includes a loop over each candidate determined and found in the preceding step (1810 of FIG. 18). For each predictor candidate in the set of predictor candidates, the CU that contains the spatial position of that candidate is determined. Then, for each reference list L0 and L1 (in the case of a B slice), the control point motion vectors {right arrow over (v₀)} and {right arrow over (v₁)} used to produce the current CU's motion field are derived according to Equation 2, using the CPMVs {right arrow over (v₂)}, {right arrow over (v₃)} and {right arrow over (v₄)} of the determined CU. If the determined CU is not in an affine mode, the CPMVs {right arrow over (v₂)}, {right arrow over (v₃)} and {right arrow over (v₄)} of the determined CU are determined as explained with respect to FIG. 16. The two CPMVs for each candidate are then stored in the set of candidate CPMVs.

Once the process of FIG. 20 is done and the process returns to FIG. 18, a loop 1830 over each Affine Merge predictor candidate is performed. It may select, for example, the CPMV candidate that leads to the lowest rate distortion cost. Inside the loop 1830 over each candidate, another loop 1840, which is similar to the process shown on FIG. 14, is used to code the current CU with each CPMV candidate as explained before. The algorithm of FIG. 14 ends when all candidates have been evaluated, and its output may comprise the index of the best predictor. As indicated before, as an example, the candidate with the minimum rate distortion cost may be selected as the best predictor. Various embodiments use the best predictor to encode the current CU, and certain embodiments also encode an index for the best predictor.

One example of a determination of the rate distortion cost is defined as follows, as is well known to a person skilled in the art:

RD_(cost) = D + λ × R

wherein D represents the distortion (typically an L2 distance) between the original block and a reconstructed block obtained by encoding and decoding the current CU with the considered candidate; R represents the rate cost, e.g., the number of bits generated by coding the current block with the considered candidate; and λ is the Lagrange parameter, which represents the rate target at which the video sequence is being encoded.
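As an illustration of the selection in loops 1830/1840, the sketch below evaluates RD_(cost) = D + λ × R for each CPMV candidate and keeps the index of the cheapest one; encode_cu(), which would return the distortion and rate obtained by coding the CU with a given candidate, is a hypothetical routine.

```python
# Hypothetical sketch of the rate distortion based candidate selection:
# code the current CU with each CPMV candidate, compute D + lambda * R,
# and return the index of the candidate with the minimum cost.

def select_best_candidate(cu, cpmv_candidates, encode_cu, lam):
    best_index, best_cost = None, float("inf")
    for index, cpmvs in enumerate(cpmv_candidates):
        d, r = encode_cu(cu, cpmvs)   # distortion and rate for this candidate
        cost = d + lam * r            # RD_cost = D + lambda * R
        if cost < best_cost:
            best_index, best_cost = index, cost
    return best_index                 # index that may be coded in the stream
```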

One advantage of the exemplary candidate set extension methods described in this application is an increase in the variety of the set of candidate Control Point Motion Vectors that may be used to construct the affine motion field associated with a given CU. Thus, the present embodiments provide a technological advancement in the computing technology of video content encoding and decoding. For example, the present embodiments improve the rate distortion performance provided by the Affine Merge coding mode in JEM. In this way, the overall rate distortion performance of the considered video codec is improved.

Also, according to another general aspect of at least one embodiment, the Affine Inter mode as described before may also be improved with all of the current teachings presented herewith by having an extended list of predictor candidates. As described above in connection with FIG. 8A, one or more CPMVPs of an Affine Inter CU are derived from neighboring motion vectors regardless of their coding mode. As disclosed in FIG. 17, it is possible to derive CPMVs of a neighboring CU satisfying the predetermined criterion. That is, according to the method explained with respect to FIG. 17, for a neighboring CU that is associated to motion information close to an affine motion model, it is possible to derive estimated CPMVs for that neighboring CU. Therefore, it is then possible to take advantage of the neighbors which have a motion close to an affine motion model to construct the one or more CPMVPs of the current Affine Inter CU, as in the Affine Merge mode described before. In that case, the considered candidates may be the same list as described above for the Affine Merge mode (e.g., not limited to only spatial candidates).

Accordingly, a set of predictor candidates is provided to improve the compression/decompression performance of the current HEVC and JEM by using more predictor candidates. The process will be more efficient, and coding gain will be observed, even if a supplemental index may need to be transmitted.

According to embodiments explained with respect to FIGS. 10-13, at least one predictor candidate that is selected satisfies a predetermined criterion. Such a predictor candidate is associated to motion information that is close to an affine motion model, even though the predictor candidate is not in an affine mode.

FIGS. 21-24 illustrate coding modes that may provide predictor candidates satisfying the predetermined criterion.

According to an embodiment, a predictor candidate satisfies the predetermined criterion if the predictor candidate is associated to motion information derived from:

-   a bilateral template matching between two reference blocks in respectively two reference frames, or
-   a reference block of a reference frame identified by motion information of a first spatial neighboring block of the predictor candidate, or
-   an average of motion vectors of spatial and temporal neighboring blocks of the predictor candidate.

FIG. 21 illustrates a current CU (Cur block) of a picture (Cur Pic) predicted using a bilateral template matching between two reference blocks in respectively two reference frames (Ref0, Ref1). The motion vector of the current CU is refined according to the bilateral template matching cost minimization. As illustrated in FIG. 22, this current CU is then divided into smaller sub-blocks (s₀, s_(w), s_(h), s_(wh)) and the motion vector for each sub-block is further refined with the bilateral template matching cost independently at the sub-block level.

Either at the CU or at the sub-block level, the templates are defined as the reference blocks in the reference frames, as shown on FIG. 21. The first template is obtained through a candidate motion vector referring to a reference frame from a particular reference frame list (for example, with MV0 on reference frame 0 of reference frame list 0). The second template is obtained in a reference frame from the other reference frame list (on reference frame 0 of reference frame list 1) with a scaled version of the candidate motion vector (MV1) so that the motion trajectory goes through the current CU in the current frame. The associated bilateral template matching cost is then the SAD between these two reference blocks (templates).
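For illustration, the bilateral cost may be sketched as the SAD between the two templates; fetch_block(), which would motion-compensate a block of samples from a reference frame, is a hypothetical helper, and templates are represented here as 2D lists of samples.

```python
# Hypothetical sketch of the bilateral template matching cost: the SAD
# between the two reference blocks (templates) addressed by MV0 and its
# scaled mirror MV1 in the two reference frames.

def bilateral_cost(ref0, ref1, pos, mv0, mv1, size, fetch_block):
    """SAD between the template in ref0 (via mv0) and in ref1 (via mv1)."""
    t0 = fetch_block(ref0, pos, mv0, size)   # template in list-0 frame
    t1 = fetch_block(ref1, pos, mv1, size)   # template in list-1 frame
    return sum(abs(a - b)
               for row0, row1 in zip(t0, t1)
               for a, b in zip(row0, row1))
```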

According to the bilateral template matching, since a CU coded using the bilateral mode has a slightly different motion vector for each of its sub-blocks, the motion vectors of the sub-blocks can be interpreted as a motion field. In some cases, this motion field may be close to an affine motion field. It is then possible to estimate a nearest affine motion field for that CU, i.e., the closest affine model with its CPMVs, so that the estimated CPMVs can be used as a predictor candidate to predict an affine coding mode. Estimating a nearest affine motion field for a CU coded with the bilateral template matching may be performed as explained with respect to FIG. 17.

FIG. 23 illustrates a current CU of a picture (Cur Pic) predicted using a reference block of a reference frame (Ref0) identified by motion information of a first spatial neighboring block of the predictor candidate. Such a coding mode is also known as ATMVP (Alternative Temporal Motion Vector Prediction) in the JEM. The ATMVP candidate aims at reproducing the partitioning observed in the reference frame Ref0 at a position given by the first spatial candidate from the merge predictor list. The first Merge (spatial) candidate gives a motion vector and a current reference frame (for example, MV0 and Ref0 on FIG. 23). The partitions observed in the corresponding compensated block are copied to the current one, and the associated motion vectors are scaled according to the current reference frame. The copied partitions can come from one or several CUs that can be coded with any modes. Thus, when the partitions come from Affine, Template and/or FRUC Bilateral CUs, it is possible to estimate and store the corresponding Affine model (CPMVs) and then to use the estimated Affine model as a predictor for an Affine coding mode. Estimating the Affine model for a CU coded in ATMVP may be performed as explained with respect to FIG. 17. For a CU coded in ATMVP, the partitions of the CU may be divided into 4×4 sub-blocks, and each 4×4 sub-block is associated to the motion information of the partition to which it belongs. The process disclosed in FIG. 17 is then performed on each of the 4×4 sub-blocks of the CU to obtain an affine model for the CU.

In JEM, a CU may be coded in an STMVP mode wherein motion information for that CU is derived as an average of motion vectors of spatial and temporal neighboring blocks of the CU. The STMVP candidate performs an average of spatial and temporal neighboring motion vectors at a 4×4 sub-block level, as shown on FIG. 24. The motion vector of each sub-block is defined as the average of the top and left spatial neighboring motion vectors and of the bottom-right temporal motion vector. For example, the motion vector of sub-block A is the average of the spatial b and c motion vectors and of the D temporal motion vector. If the surrounding and temporal neighbors come from Affine, Template and/or FRUC Bilateral CUs, the STMVP motion vectors will be close to an affine motion field. It is then possible to estimate and store a corresponding Affine model (CPMVs) for the CU coded in STMVP, as explained in FIG. 17, and to use it as a predictor for an Affine coding mode.
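As a small illustration of the STMVP derivation just described (sub-block A averaged from spatial neighbors b and c and temporal neighbor D), the following sketch averages the three neighboring motion vectors; the (x, y) tuple representation is an assumption of the example.

```python
# Hypothetical sketch of the STMVP sub-block motion derivation: the
# motion vector of a 4x4 sub-block is the average of its top and left
# spatial neighboring motion vectors and of its temporal motion vector.

def stmvp_sub_block_mv(mv_top, mv_left, mv_temporal):
    """Average of the top, left and temporal neighboring motion vectors."""
    return ((mv_top[0] + mv_left[0] + mv_temporal[0]) / 3.0,
            (mv_top[1] + mv_left[1] + mv_temporal[1]) / 3.0)
```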

FIG. 25 illustrates a block diagram of an exemplary system 2500 in which various aspects of the exemplary embodiments may be implemented. The system 2500 may be embodied as a device including the various components described below and is configured to perform the processes described above. Examples of such devices include, but are not limited to, personal computers, laptop computers, smartphones, tablet computers, digital multimedia set top boxes, digital television receivers, personal video recording systems, connected home appliances, and servers. The system 2500 may be communicatively coupled to other similar systems, and to a display via a communication channel as shown in FIG. 25 and as known by those skilled in the art, to implement all or part of the exemplary video systems described above.

Various embodiments of the system 2500 include at least one processor 2510 configured to execute instructions loaded therein for implementing the various processes as discussed above. The processor 2510 may include embedded memory, an input output interface, and various other circuitries as known in the art. The system 2500 may also include at least one memory 2520 (e.g., a volatile memory device, a non-volatile memory device). The system 2500 may additionally include a storage device 2540, which may include non-volatile memory, including, but not limited to, EEPROM, ROM, PROM, RAM, DRAM, SRAM, flash, magnetic disk drive, and/or optical disk drive. The storage device 2540 may comprise an internal storage device, an attached storage device, and/or a network accessible storage device, as non-limiting examples. The system 2500 may also include an encoder/decoder module 2530 configured to process data to provide encoded video and/or decoded video, and the encoder/decoder module 2530 may include its own processor and memory.

The encoder/decoder module 2530 represents the module(s) that may be included in a device to perform the encoding and/or decoding functions. As is known, such a device may include one or both of the encoding and decoding modules. Additionally, the encoder/decoder module 2530 may be implemented as a separate element of the system 2500 or may be incorporated within one or more processors 2510 as a combination of hardware and software as known to those skilled in the art.

Program code to be loaded onto one or more processors 2510 to perform the various processes described hereinabove may be stored in the storage device 2540 and subsequently loaded onto the memory 2520 for execution by the processors 2510. In accordance with the exemplary embodiments, one or more of the processor(s) 2510, the memory 2520, the storage device 2540, and the encoder/decoder module 2530 may store one or more of the various items during the performance of the processes discussed herein above, including, but not limited to, the input video, the decoded video, the bitstream, equations, formulas, matrices, variables, operations, and operational logic.

The system 2500 may also include a communication interface 2550 that enables communication with other devices via a communication channel 2560. The communication interface 2550 may include, but is not limited to, a transceiver configured to transmit and receive data from the communication channel 2560. The communication interface 2550 may include, but is not limited to, a modem or network card, and the communication channel 2560 may be implemented within a wired and/or wireless medium. The various components of the system 2500 may be connected or communicatively coupled together (not shown in FIG. 25) using various suitable connections, including, but not limited to, internal buses, wires, and printed circuit boards.

The exemplary embodiments may be carried out by computer software implemented by the processor 2510, or by hardware, or by a combination of hardware and software. As a non-limiting example, the exemplary embodiments may be implemented by one or more integrated circuits. The memory 2520 may be of any type appropriate to the technical environment and may be implemented using any appropriate data storage technology, such as optical memory devices, magnetic memory devices, semiconductor-based memory devices, fixed memory, and removable memory, as non-limiting examples. The processor 2510 may be of any type appropriate to the technical environment, and may encompass one or more of microprocessors, general purpose computers, special purpose computers, and processors based on a multi-core architecture, as non-limiting examples.

The implementations described herein may be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the implementation of features discussed may also be implemented in other forms (for example, an apparatus or a program). An apparatus may be implemented in, for example, appropriate hardware, software, and firmware. The methods may be implemented in, for example, an apparatus such as, for example, a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device. Processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants (“PDAs”), and other devices that facilitate communication of information between end-users.

Furthermore, one skilled in the art may readily appreciate that the exemplary HEVC encoder 100 shown in FIG. 1 and the exemplary HEVC decoder shown in FIG. 3 may be modified according to the above teachings of the present disclosure in order to implement the disclosed improvements to the existing HEVC standards for achieving better compression/decompression. For example, entropy coding 145, motion compensation 170, and motion estimation 175 in the exemplary encoder 100 of FIG. 1, and entropy decoding 330 and motion compensation 375 in the exemplary decoder of FIG. 3, may be modified according to the disclosed teachings to implement one or more exemplary aspects of the present disclosure, including providing an enhanced affine merge prediction to the existing JEM.

Reference to “one embodiment” or “an embodiment” or “one implementation” or “an implementation”, as well as other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrase “in one embodiment” or “in an embodiment” or “in one implementation” or “in an implementation”, as well as any other variations, appearing in various places throughout the specification are not necessarily all referring to the same embodiment.

Additionally, this application or its claims may refer to “determining” various pieces of information. Determining the information may include one or more of, for example, estimating the information, calculating the information, predicting the information, or retrieving the information from memory.

Further, this application or its claims may refer to “accessing” various pieces of information. Accessing the information may include one or more of, for example, receiving the information, retrieving the information (for example, from memory), storing the information, processing the information, transmitting the information, moving the information, copying the information, erasing the information, calculating the information, determining the information, predicting the information, or estimating the information.

Additionally, this application or its claims may refer to “receiving” various pieces of information. Receiving is, as with “accessing”, intended to be a broad term. Receiving the information may include one or more of, for example, accessing the information, or retrieving the information (for example, from memory). Further, “receiving” is typically involved, in one way or another, during operations such as, for example, storing the information, processing the information, transmitting the information, moving the information, copying the information, erasing the information, calculating the information, determining the information, predicting the information, or estimating the information.

As will be evident to one of skill in the art, implementations may produce a variety of signals formatted to carry information that may be, for example, stored or transmitted. The information may include, for example, instructions for performing a method, or data produced by one of the described implementations. For example, a signal may be formatted to carry the bitstream of a described embodiment. Such a signal may be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal. The formatting may include, for example, encoding a data stream and modulating a carrier with the encoded data stream. The information that the signal carries may be, for example, analog or digital information. The signal may be transmitted over a variety of different wired or wireless links, as is known. The signal may be stored on a processor-readable medium.

1-15. (canceled)
16. A method for video decoding, comprising: determining, for a block being decoded in a picture, a predictor candidate; determining, for the predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated to the predictor candidate; determining, for the block being decoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the predictor candidate; determining, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being decoded; and decoding the block based on the corresponding motion field.
17. The method according to claim 16, wherein the predictor candidate is comprised in a set of predictor candidates, and wherein determining, for the block being decoded, the predictor candidate comprises receiving an index corresponding to the predictor candidate in the set of predictor candidates.
18. The method of claim 16, wherein the motion information associated to the predictor candidate corresponds to translational motion information.
19. The method of claim 16, wherein determining, for the predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated to the predictor candidate, comprises: determining one or more corresponding control points associated to the predictor candidate, based on motion information associated to the predictor candidate; and determining the one or more corresponding control point generator motion vectors from the one or more corresponding control points associated to the predictor candidate.
20. The method of claim 19, wherein the one or more corresponding control point generator motion vectors comprise a motion vector {right arrow over (v₂)} of a top left corner of the predictor candidate, a motion vector {right arrow over (v₃)} of an above right corner of the predictor candidate, and a motion vector {right arrow over (v₄)} of a left bottom corner of the predictor candidate, and wherein the one or more corresponding control point motion vectors for the block comprise a motion vector {right arrow over (v₀)} of a top left corner of the block and a motion vector {right arrow over (v₁)} of an above right corner of the block, and wherein the motion vectors {right arrow over (v₀)} and {right arrow over (v₁)} are determined by:

$$\overset{\rightarrow}{v_{0}} = \overset{\rightarrow}{v_{2}} + \left( \overset{\rightarrow}{v_{4}} - \overset{\rightarrow}{v_{2}} \right)\left( \frac{Y_{curr} - Y_{neighb}}{H_{neighb}} \right) + \left( \overset{\rightarrow}{v_{3}} - \overset{\rightarrow}{v_{2}} \right)\left( \frac{X_{curr} - X_{neighb}}{W_{neighb}} \right)$$

$$\overset{\rightarrow}{v_{1}} = \overset{\rightarrow}{v_{0}} + \left( \overset{\rightarrow}{v_{3}} - \overset{\rightarrow}{v_{2}} \right)\left( \frac{W_{curr}}{W_{neighb}} \right),$$

where Y_(curr), Y_(neighb) are respectively the vertical positions of the block and of the predictor candidate in the picture, X_(curr), X_(neighb) are respectively the horizontal positions of the block and of the predictor candidate in the picture, W_(curr) is the horizontal size of the block, and W_(neighb), H_(neighb) are respectively the horizontal and vertical sizes of the predictor candidate.
21. The method of claim 19, wherein the predictor candidate comprises one or more sub-blocks, each sub-block being associated to at least one motion vector, and wherein determining, for the predictor candidate, one or more corresponding control points associated to the predictor candidate, based on motion information associated to the predictor candidate, comprises: determining one or more corresponding control point motion vectors associated to the predictor candidate, based on at least two motion vectors associated respectively to at least two sub-blocks of the predictor candidate; and verifying that the one or more corresponding control point motion vectors associated to the predictor candidate satisfy an affine motion model.
22. The method of claim 19, wherein the predictor candidate comprises one or more sub-blocks, each sub-block being associated to at least one motion vector, and wherein determining, for the predictor candidate, one or more corresponding control points associated to the predictor candidate, based on motion information associated to the predictor candidate, comprises: determining, for at least two distinct sets of at least three sub-blocks of the predictor candidate, one or more corresponding control point motion vectors for the predictor candidate associated respectively to the at least two sets, based on the motion vectors associated respectively to the at least three sub-blocks of each set; and calculating one or more corresponding control point motion vectors associated to the predictor candidate by averaging the determined one or more corresponding control point motion vectors associated to each set.
23. The method of claim 16, wherein the motion information associated to the predictor candidate is derived from at least one of: a bilateral template matching between two reference blocks in respectively two reference frames; a reference block of a reference frame identified by motion information of a first spatial neighboring block of the predictor candidate; or an average of motion vectors of spatial and temporal neighboring blocks of the predictor candidate.
24. An apparatus for video decoding, comprising a memory and at least one processor configured for: determining, for a block being decoded in a picture, a predictor candidate; determining, for the predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated to the predictor candidate; determining, for the block being decoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the predictor candidate; determining, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being decoded; and decoding the block based on the corresponding motion field.
25. A method for video encoding, comprising: determining, for a block being encoded in a picture, at least one predictor candidate; determining, for the at least one predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated to the at least one predictor candidate; determining, for the block being encoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the at least one predictor candidate; determining, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being encoded; and encoding the block based on the corresponding motion field.
26. The encoding method of claim 25, wherein the at least one predictor candidate is comprised in a set of predictor candidates, the encoding method further comprising: encoding an index for the at least one predictor candidate from the set of predictor candidates.
27. The method of claim 25, wherein the motion information associated to the predictor candidate corresponds to translational motion information.
28. The method of claim 25, wherein determining, for the predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated to the predictor candidate, comprises: determining one or more corresponding control points associated to the predictor candidate, based on motion information associated to the predictor candidate; and determining the one or more corresponding control point generator motion vectors from the one or more corresponding control points associated to the predictor candidate.
29. The method of claim 28, wherein the one or more corresponding control point generator motion vectors comprise a motion vector {right arrow over (v₂)} of a top left corner of the predictor candidate, a motion vector {right arrow over (v₃)} of an above right corner of the predictor candidate, and a motion vector {right arrow over (v₄)} of a left bottom corner of the predictor candidate, and wherein the one or more corresponding control point motion vectors for the block comprise a motion vector {right arrow over (v₀)} of a top left corner of the block and a motion vector {right arrow over (v₁)} of an above right corner of the block, and wherein the motion vectors {right arrow over (v₀)} and {right arrow over (v₁)} are determined by:

$$\overset{\rightarrow}{v_{0}} = \overset{\rightarrow}{v_{2}} + \left( \overset{\rightarrow}{v_{4}} - \overset{\rightarrow}{v_{2}} \right)\left( \frac{Y_{curr} - Y_{neighb}}{H_{neighb}} \right) + \left( \overset{\rightarrow}{v_{3}} - \overset{\rightarrow}{v_{2}} \right)\left( \frac{X_{curr} - X_{neighb}}{W_{neighb}} \right)$$

$$\overset{\rightarrow}{v_{1}} = \overset{\rightarrow}{v_{0}} + \left( \overset{\rightarrow}{v_{3}} - \overset{\rightarrow}{v_{2}} \right)\left( \frac{W_{curr}}{W_{neighb}} \right),$$

where Y_(curr), Y_(neighb) are respectively the vertical positions of the block and of the predictor candidate in the picture, X_(curr), X_(neighb) are respectively the horizontal positions of the block and of the predictor candidate in the picture, W_(curr) is the horizontal size of the block, and W_(neighb), H_(neighb) are respectively the horizontal and vertical sizes of the predictor candidate.
30. The method of claim 28, wherein the predictor candidate comprises one or more sub-blocks, each sub-block being associated to at least one motion vector, and wherein determining, for the predictor candidate, one or more corresponding control points associated to the predictor candidate, based on motion information associated to the predictor candidate, comprises: determining one or more corresponding control point motion vectors associated to the predictor candidate, based on at least two motion vectors associated respectively to at least two sub-blocks of the predictor candidate; and verifying that the one or more corresponding control point motion vectors associated to the predictor candidate satisfy an affine motion model.
31. The method of claim 28, wherein the predictor candidate comprises one or more sub-blocks, each sub-block being associated to at least one motion vector, and wherein determining, for the predictor candidate, one or more corresponding control points associated to the predictor candidate, based on motion information associated to the predictor candidate, comprises: determining, for at least two distinct sets of at least three sub-blocks of the predictor candidate, one or more corresponding control point motion vectors for the predictor candidate associated respectively to the at least two sets, based on the motion vectors associated respectively to the at least three sub-blocks of each set; and calculating one or more corresponding control point motion vectors associated to the predictor candidate by averaging the determined one or more corresponding control point motion vectors associated to each set.
32. The method of claim 25, wherein the motion information associated to the predictor candidate is derived from at least one of: a bilateral template matching between two reference blocks in respectively two reference frames; a reference block of a reference frame identified by motion information of a first spatial neighboring block of the predictor candidate; or an average of motion vectors of spatial and temporal neighboring blocks of the predictor candidate.
33. An apparatus for video encoding, comprising a memory and at least one processor configured for: determining, for a block being encoded in a picture, at least one predictor candidate; determining, for the at least one predictor candidate, one or more corresponding control point generator motion vectors, based on motion information associated to the at least one predictor candidate; determining, for the block being encoded, one or more corresponding control point motion vectors, based on the one or more corresponding control point generator motion vectors determined for the at least one predictor candidate; determining, based on the one or more corresponding control point motion vectors determined for the block, a corresponding motion field, wherein the corresponding motion field identifies motion vectors used for prediction of sub-blocks of the block being encoded; and encoding the block based on the corresponding motion field.
34. A non-transitory computer readable medium containing data content generated according to the method of claim 25.
35. A non-transitory computer readable medium containing data content generated according to the apparatus of claim 33.
36. A computer readable storage medium having stored thereon instructions for decoding video data according to the method of claim 16.
37. A computer readable storage medium having stored thereon instructions for encoding video data according to the method of claim 25.